{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"https://www.youtube.com/watch?v=Ua_ToM-CG5Q&list=PLxqBkZuBynVQEvXfJpq3smfuKq3AiNW-N&index=11\"><h1 style=\"font-size:250%; font-family:cursive; color:#ff6666;\"><b>Link to my YouTube Video</b></h1></a>\n",
    "\n",
    "[![IMAGE ALT TEXT](https://imgur.com/k6tXdtA.png)](https://bit.ly/3mXnKGH \"Decoding strategies while generating text with GPT-2\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## First, what is BERT?\n",
    "\n",
    "BERT stands for Bidirectional Encoder Representations from Transformers. The name itself gives us several clues to what BERT is all about.\n",
    "\n",
    "The BERT architecture consists of several Transformer encoders stacked together. Each Transformer encoder contains two sub-layers: a self-attention layer and a feed-forward layer.\n",
    "\n",
    "### There are two different BERT models:\n",
    "\n",
    "- BERT base, which consists of 12 Transformer encoder layers, 12 attention heads, a hidden size of 768, and 110M parameters.\n",
    "\n",
    "- BERT large, which consists of 24 Transformer encoder layers, 16 attention heads, a hidden size of 1024, and 340M parameters.\n",
    "\n",
    "\n",
    "### BERT Input and Output\n",
    "\n",
    "The BERT model expects a sequence of tokens (words or sub-word pieces) as input. In each sequence, there are two special tokens that BERT expects:\n",
    "\n",
    "- [CLS]: This is the first token of every sequence; it stands for classification token.\n",
    "- [SEP]: This token marks where a sequence ends, so BERT can tell which tokens belong to which sequence. It matters most for tasks that take two sequences, such as next sentence prediction or question answering. If we only have one sequence, this token is appended to the end of it.\n",
    "\n",
    "\n",
    "It is also important to note that the maximum number of tokens that can be fed into the BERT model is 512. If a sequence has fewer than 512 tokens, we can use padding to fill the unused token slots with the [PAD] token. If a sequence has more than 512 tokens, we need to truncate it.\n",
    "\n",
    "And that’s all that BERT expects as input.\n",
    "\n",
    "BERT then outputs an embedding vector for each token (of size 768 for BERT base). We can use these vectors as input for different kinds of NLP applications, such as text classification, next sentence prediction, named-entity recognition (NER), or question answering.\n",
    "\n",
    "\n",
    "------------\n",
    "\n",
    "**For a text classification task**, we focus on the embedding vector output for the special [CLS] token. This means that we use the embedding vector of size 768 from the [CLS] token as input for our classifier, which then outputs a vector whose size equals the number of classes in our classification task.\n",
    "\n",
    "-----------------------\n",
    "\n",
    "![Imgur](https://imgur.com/NpeB9vb.png)\n",
    "\n",
    "-------------------------"
   ]
  },
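  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The input layout described above can be sketched without any library. This is only an illustration under stated assumptions: `build_bert_input` is a hypothetical helper (it is not part of `transformers`), it works on already-split word strings rather than real WordPiece ids, and `max_len=8` stands in for BERT's 512-token limit.\n",
    "\n",
    "```python\n",
    "def build_bert_input(tokens, max_len=8):\n",
    "    # Truncate so that [CLS] and [SEP] still fit inside max_len\n",
    "    tokens = ['[CLS]'] + tokens[: max_len - 2] + ['[SEP]']\n",
    "    # 1 marks a real token, 0 marks padding\n",
    "    attention_mask = [1] * len(tokens) + [0] * (max_len - len(tokens))\n",
    "    # Fill the unused slots with [PAD]\n",
    "    tokens = tokens + ['[PAD]'] * (max_len - len(tokens))\n",
    "    return tokens, attention_mask\n",
    "\n",
    "\n",
    "short_tokens, short_mask = build_bert_input(['hello', 'world'])\n",
    "long_tokens, long_mask = build_bert_input(list('abcdefghij'))\n",
    "\n",
    "print(short_tokens, short_mask)  # padded up to max_len\n",
    "print(long_tokens, long_mask)    # truncated down to max_len\n",
    "```\n",
    "\n",
    "In practice a Hugging Face tokenizer adds the special tokens itself, and its `padding`, `truncation` and `max_length` arguments take care of the sequence length."
   ]
  },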
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "------\n",
    "\n",
    "## Greedy Search\n",
    "\n",
    "## Beam Search\n",
    "\n",
    "## Random Sampling (including Top-k and Top-p (nucleus) sampling)\n",
    "\n",
    "------"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "All of the decoding methods above can be used for auto-regressive language generation. In short, auto-regressive language generation is based on the assumption that the probability distribution of a word sequence can be decomposed into the product of conditional next-word distributions:\n",
    "\n",
    "![](assets/2022-09-06-22-16-01.png)"
   ]
  },
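  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Written out, the factorization referred to above is the chain rule of probability. The notation here ($W_0$ for the initial context, $T$ for the number of generated tokens) is an assumption chosen for this note, not something defined elsewhere in the notebook:\n",
    "\n",
    "$$P(w_{1:T} \\mid W_0) = \\prod_{t=1}^{T} P(w_t \\mid w_{1:t-1}, W_0)$$\n",
    "\n",
    "A minimal sketch with made-up conditional probabilities shows why decoding code usually sums log-probabilities instead of multiplying probabilities directly:\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "# Hypothetical values of P(w_t | w_1..w_{t-1}) for a three-token sequence\n",
    "cond_probs = [0.5, 0.2, 0.1]\n",
    "\n",
    "# Chain rule: the sequence probability is the product of the conditionals\n",
    "seq_prob = math.prod(cond_probs)\n",
    "\n",
    "# Equivalent and numerically safer for long sequences: sum the log-probabilities\n",
    "seq_log_prob = sum(math.log(p) for p in cond_probs)\n",
    "\n",
    "# Both views describe the same quantity\n",
    "assert math.isclose(math.exp(seq_log_prob), seq_prob)\n",
    "print(seq_prob, seq_log_prob)\n",
    "```"
   ]
  },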
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from transformers import AutoTokenizer, AutoModelForCausalLM\n",
    "\n",
    "# device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
    "device = 'cpu'\n",
    "\n",
    "model_name = 'gpt2-medium'\n",
    "\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
    "model = AutoModelForCausalLM.from_pretrained(model_name).to(device)\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Greedy Search Decoding\n",
    "\n",
    "The simplest decoding method to get discrete tokens from a model’s continuous\n",
    "output is to greedily select the token with the highest probability at each\n",
    "timestep:\n",
    "\n",
    "![](assets/2022-08-30-20-06-33.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Input</th>\n",
       "      <th>Choice 1</th>\n",
       "      <th>Choice 2</th>\n",
       "      <th>Choice 3</th>\n",
       "      <th>Choice 4</th>\n",
       "      <th>Choice 5</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Bitcoin will be</td>\n",
       "      <td>the (10.88)</td>\n",
       "      <td>a (8.09)</td>\n",
       "      <td>used (3.84)</td>\n",
       "      <td>able (2.94)</td>\n",
       "      <td>an (1.46)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Bitcoin will be the</td>\n",
       "      <td>first (9.30)</td>\n",
       "      <td>most (5.43)</td>\n",
       "      <td>next (5.40)</td>\n",
       "      <td>currency (4.40)</td>\n",
       "      <td>new (3.12)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Bitcoin will be the first</td>\n",
       "      <td>cryptocurrency (12.49)</td>\n",
       "      <td>to (8.87)</td>\n",
       "      <td>currency (7.59)</td>\n",
       "      <td>digital (6.98)</td>\n",
       "      <td>major (6.23)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Bitcoin will be the first cryptocurrency</td>\n",
       "      <td>to (51.52)</td>\n",
       "      <td>that (9.00)</td>\n",
       "      <td>with (3.00)</td>\n",
       "      <td>, (2.95)</td>\n",
       "      <td>in (1.99)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Bitcoin will be the first cryptocurrency to</td>\n",
       "      <td>be (8.03)</td>\n",
       "      <td>have (6.58)</td>\n",
       "      <td>reach (3.63)</td>\n",
       "      <td>use (2.96)</td>\n",
       "      <td>achieve (2.92)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Bitcoin will be the first cryptocurrency to be</td>\n",
       "      <td>listed (5.83)</td>\n",
       "      <td>accepted (3.71)</td>\n",
       "      <td>backed (3.20)</td>\n",
       "      <td>launched (3.19)</td>\n",
       "      <td>released (3.01)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Bitcoin will be the first cryptocurrency to be...</td>\n",
       "      <td>on (76.07)</td>\n",
       "      <td>in (8.06)</td>\n",
       "      <td>by (2.82)</td>\n",
       "      <td>and (2.08)</td>\n",
       "      <td>as (1.84)</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Bitcoin will be the first cryptocurrency to be...</td>\n",
       "      <td>the (36.40)</td>\n",
       "      <td>a (9.31)</td>\n",
       "      <td>Nas (7.07)</td>\n",
       "      <td>an (5.18)</td>\n",
       "      <td>exchanges (3.12)</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                               Input                 Choice 1  \\\n",
       "0                                    Bitcoin will be              the (10.88)   \n",
       "1                                Bitcoin will be the             first (9.30)   \n",
       "2                          Bitcoin will be the first   cryptocurrency (12.49)   \n",
       "3           Bitcoin will be the first cryptocurrency               to (51.52)   \n",
       "4        Bitcoin will be the first cryptocurrency to                be (8.03)   \n",
       "5     Bitcoin will be the first cryptocurrency to be            listed (5.83)   \n",
       "6  Bitcoin will be the first cryptocurrency to be...               on (76.07)   \n",
       "7  Bitcoin will be the first cryptocurrency to be...              the (36.40)   \n",
       "\n",
       "           Choice 2          Choice 3          Choice 4           Choice 5  \n",
       "0          a (8.09)       used (3.84)       able (2.94)          an (1.46)  \n",
       "1       most (5.43)       next (5.40)   currency (4.40)         new (3.12)  \n",
       "2         to (8.87)   currency (7.59)    digital (6.98)       major (6.23)  \n",
       "3       that (9.00)       with (3.00)          , (2.95)          in (1.99)  \n",
       "4       have (6.58)      reach (3.63)        use (2.96)     achieve (2.92)  \n",
       "5   accepted (3.71)     backed (3.20)   launched (3.19)    released (3.01)  \n",
       "6         in (8.06)         by (2.82)        and (2.08)          as (1.84)  \n",
       "7          a (9.31)        Nas (7.07)         an (5.18)   exchanges (3.12)  "
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\n",
    "import pandas as pd\n",
    "\n",
    "time_steps = 8\n",
    "choices_per_step = 5\n",
    "\n",
    "def get_next_token_greedy_search(input_txt, input_ids):\n",
    "    \"\"\"\n",
    "    Runs greedy decoding for `time_steps` steps and records the top candidate tokens at each step.\n",
    "\n",
    "    Args:\n",
    "        input_txt (str): The prompt text (not used directly; the prompt is decoded from `input_ids`).\n",
    "        input_ids (torch.Tensor): Token ids of the prompt, shape (1, sequence_length).\n",
    "\n",
    "    Returns:\n",
    "        pd.DataFrame: One row per decoding step, holding the input so far and the top `choices_per_step` tokens with their probabilities.\n",
    "\n",
    "    \"\"\"\n",
    "    iterations = []\n",
    "    # Run the decoding loop for `time_steps` (eight) steps.\n",
    "    with torch.no_grad():\n",
    "        for _ in range(time_steps):\n",
    "            iteration = dict()\n",
    "            iteration[\"Input\"] = tokenizer.decode(input_ids[0])\n",
    "            output = model(input_ids=input_ids)\n",
    "            # print('output.logits ', output.logits)\n",
    "            # output.logits is 3-D Tensor\n",
    "            # Select logits of the first batch and the last token\n",
    "            next_token_logits = output.logits[0, -1, :]\n",
    "            # next_token_logits is a 1-D Tensor\n",
    "            # print(next_token_logits)\n",
    "            # tensor([-100.3290,  -99.9514, -105.3466,  ..., -108.7789, -104.5404,-100.8237])\n",
    "            \n",
    "            # Now apply softmax\n",
    "            next_token_probabilities = torch.softmax(next_token_logits, dim=-1)\n",
    "            \n",
    "            # torch.argsort => Returns the indices that sort a tensor along a given dimension\n",
    "            sorted_indices_of_next_token_proba = torch.argsort(next_token_probabilities, dim=-1, descending=True)\n",
    "            # print('sorted_indices_of_next_token_proba ', sorted_indices_of_next_token_proba) # tensor([  262,   257,   973,  ..., 42300, 41974, 39500])\n",
    "            # print('sorted_indices_of_next_token_proba ', sorted_indices_of_next_token_proba.shape) # torch.Size([50257])\n",
    "            # print('next_token_probabilities ', next_token_probabilities.shape) # torch.Size([50257])\n",
    "            # in total, there are 50,257 tokens in GPT-2’s vocabulary\n",
    "            # so both 'next_token_probabilities' and 'sorted_indices_of_next_token_proba' have the same shape of torch.Size([50257])\n",
    "            \n",
    "            # Store tokens with the top-most 5 highest probabilities\n",
    "            for choice_idx in range(choices_per_step):\n",
    "                token_index_sorted = sorted_indices_of_next_token_proba[choice_idx]\n",
    "                # print(\"token_index_sorted \", token_index_sorted) # tensor(262)\n",
    "                # So `next_token_probabilities[262]` will give me tensor(0.1088)\n",
    "                token_prob = next_token_probabilities[token_index_sorted].cpu().numpy()\n",
    "                \n",
    "                # Create a string with decoded text and corresponding probability\n",
    "                token_choice = (\n",
    "                    f\"{tokenizer.decode(token_index_sorted)} ({100 * token_prob:.2f}%)\"\n",
    "                )\n",
    "                iteration[f\"Choice {choice_idx+1}\"] = token_choice\n",
    "            # Append predicted next token to input\n",
    "            input_ids = torch.cat([input_ids, sorted_indices_of_next_token_proba[None, 0, None]], dim=-1)\n",
    "            iterations.append(iteration)\n",
    "            # print(iterations)\n",
    "            \n",
    "    return pd.DataFrame(iterations)\n",
    "\n",
    "input_txt = \"Bitcoin will be\"\n",
    "input_ids = tokenizer(input_txt, return_tensors=\"pt\")[\"input_ids\"].to(device)\n",
    "\n",
    "get_next_token_greedy_search(input_txt, input_ids)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Model outputs\n",
    "\n",
    "https://huggingface.co/docs/transformers/main/main_classes/output#model-outputs\n",
    "\n",
    "\n",
    "### `output.logits` holds the unnormalized scores (logits) for every token in the vocabulary. Apply the softmax function to turn them into probabilities.\n",
    "\n",
    "For a causal language model such as GPT-2, the output object is a `CausalLMOutputWithCrossAttentions`: it has an optional `loss`, a `logits` tensor, and optional `past_key_values`, `hidden_states` and `attentions` attributes.\n",
    "\n",
    "Here `loss` is `None` because we did not pass `labels`, and we don't have `hidden_states` and `attentions` because we didn't pass `output_hidden_states=True` or `output_attentions=True`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Bitcoin will be the first cryptocurrency to be listed on the\n"
     ]
    }
   ],
   "source": [
    "input_ids = tokenizer(input_txt, return_tensors = 'pt' )['input_ids'].to(device)\n",
    "\n",
    "output = model.generate(input_ids, max_new_tokens=time_steps, do_sample = False )\n",
    "\n",
    "print(tokenizer.decode(output[0]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it.\n",
      "\n",
      "But as the data industry has grown, so has the need for data governance. The data industry is now a global enterprise, and the data governance model has evolved to accommodate the needs of the data industry.\n",
      "\n",
      "Data governance is a complex topic, and it's not\n"
     ]
    }
   ],
   "source": [
    "input_txt = \"In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it.\" \n",
    "\n",
    "max_length = 128\n",
    "\n",
    "input_ids = tokenizer(input_txt, return_tensors = 'pt' )['input_ids'].to(device)\n",
    "\n",
    "output_greedy = model.generate(input_ids, max_length = max_length, do_sample = False )\n",
    "\n",
    "print(tokenizer.decode(output_greedy[0]))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Beam Search Decoding\n",
    "\n",
    "![](assets/2022-09-06-22-03-29.png)\n",
    "\n",
    "Whereas greedy decoding and random sampling pick each token from the next-token distribution alone, beam search keeps several candidate sequences (beams) alive at every step and ranks them by the combined probability of all their tokens.\n",
    "\n",
    "However, because beam search ranks whole sequences and selects the most probable one, the generated text can again degrade into repetitive sequences. The cells below compare the sequence log-probabilities of greedy and beam search output, then use `no_repeat_ngram_size` to curb the repetition."
   ]
  },
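  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the idea concrete, here is a minimal sketch of beam search over a toy next-token table. Everything in it is hypothetical: `TOY_MODEL`, its probabilities and the `beam_search` helper are invented for illustration and say nothing about how `model.generate` is implemented internally.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "# Hypothetical toy model: next-token probabilities depend only on the last token\n",
    "TOY_MODEL = {\n",
    "    '<s>': {'a': 0.6, 'b': 0.4},\n",
    "    'a': {'a': 0.4, 'b': 0.3, '</s>': 0.3},\n",
    "    'b': {'a': 0.05, 'b': 0.05, '</s>': 0.9},\n",
    "}\n",
    "\n",
    "\n",
    "def beam_search(num_beams=2, max_steps=3):\n",
    "    # Each beam is a pair: (list of tokens, cumulative log-probability)\n",
    "    beams = [(['<s>'], 0.0)]\n",
    "    for _ in range(max_steps):\n",
    "        candidates = []\n",
    "        for tokens, score in beams:\n",
    "            if tokens[-1] == '</s>':\n",
    "                # Finished beams are carried over unchanged\n",
    "                candidates.append((tokens, score))\n",
    "                continue\n",
    "            for token, prob in TOY_MODEL[tokens[-1]].items():\n",
    "                candidates.append((tokens + [token], score + math.log(prob)))\n",
    "        # Keep only the num_beams highest-scoring sequences\n",
    "        beams = sorted(candidates, key=lambda beam: beam[1], reverse=True)[:num_beams]\n",
    "    return beams\n",
    "\n",
    "\n",
    "for tokens, score in beam_search():\n",
    "    print(' '.join(tokens), round(score, 2))\n",
    "```\n",
    "\n",
    "In this toy table the greedy choice for the first token is `a` (0.6), yet the most probable complete sequence starts with `b` (`0.4 * 0.9 = 0.36`), and a beam of width 2 still finds it."
   ]
  },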
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch.nn.functional as F\n",
    "\n",
    "def get_log_probs_from_logits_from_single_token(logits, labels):\n",
    "    \"\"\"\n",
    "    Computes the log probability of each label token from the given logits.\n",
    "\n",
    "    Args:\n",
    "        logits (torch.Tensor): Logits tensor of shape (batch_size, sequence_length, vocab_size).\n",
    "        labels (torch.Tensor): Label tensor of shape (batch_size, sequence_length).\n",
    "\n",
    "    Returns:\n",
    "        torch.Tensor: Log probabilities of the labels, shape (batch_size, sequence_length).\n",
    "\n",
    "    \"\"\"\n",
    "    logp = F.log_softmax(logits, dim=-1)\n",
    "    # At every position, pick the log probability assigned to the label token\n",
    "    logp_label = torch.gather(logp, 2, labels.unsqueeze(2)).squeeze(-1)\n",
    "    return logp_label"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "def sequence_logprob(model, labels, input_len=0):\n",
    "    \"\"\"\n",
    "    Computes the log probability of a sequence of labels generated by a model.\n",
    "\n",
    "    Args:\n",
    "        model (torch.nn.Module): Model used for generating the labels.\n",
    "        labels (torch.Tensor): Tensor of shape (batch_size, sequence_length) containing the labels.\n",
    "        input_len (int): Length of the prompt (in tokens) to exclude from the log probability calculation.\n",
    "\n",
    "    Returns:\n",
    "        torch.Tensor: Log probability of the sequence.\n",
    "\n",
    "    \"\"\"\n",
    "    with torch.no_grad():\n",
    "        output = model(labels)\n",
    "        # Logits at position t predict the token at position t + 1, so drop the\n",
    "        # last logit and the first label to align predictions with targets\n",
    "        log_probs = get_log_probs_from_logits_from_single_token(\n",
    "            output.logits[:, :-1, :], labels[:, 1:]\n",
    "        )\n",
    "        # Sum over the continuation only. Note: log_probs[:, j] scores token j + 1,\n",
    "        # so slicing from input_len also skips the first generated token; when\n",
    "        # input_len > 0, slicing from input_len - 1 would include it.\n",
    "        seq_log_prob = torch.sum(log_probs[:, input_len:])\n",
    "    return seq_log_prob"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it.\n",
      "\n",
      "But as the data industry has grown, so has the need for data governance. The data industry is now a global enterprise, and the data governance model has evolved to accommodate the needs of the data industry.\n",
      "\n",
      "Data governance is a complex topic, and it's not\n",
      "\n",
      "log-prob: -90.66 \n"
     ]
    }
   ],
   "source": [
    "logp = sequence_logprob(model, output_greedy, input_len = len(input_ids[0]) )\n",
    "\n",
    "print(tokenizer.decode(output_greedy[0]))\n",
    "\n",
    "print(f\"\\nlog-prob: {logp:.2f} \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it.\n",
      "\n",
      "Today, however, data governance is becoming increasingly decentralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance is a siloed role, and data engineers become the de facto\n",
      "\n",
      "log-prob: -27.22 \n"
     ]
    }
   ],
   "source": [
    "output_beam = model.generate(input_ids, max_length=max_length, num_beams = 5, do_sample = False )\n",
    "\n",
    "logp = sequence_logprob(model, output_beam, input_len = len(input_ids[0]) )\n",
    "\n",
    "print(tokenizer.decode(output_beam[0]))\n",
    "\n",
    "print(f\"\\nlog-prob: {logp:.2f} \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it.\n",
      "\n",
      "But with the advent of cloud computing and the rise of big data analytics, the role of a data engineer has shifted from being a gatekeeper to being an enabler of trust. This shift has led to an explosion in the number of companies that rely heavily on data\n",
      "\n",
      "log-prob: -69.73 \n"
     ]
    }
   ],
   "source": [
    "output_beam = model.generate(input_ids, max_length=max_length, num_beams = 5, do_sample = False, no_repeat_ngram_size = 2 )\n",
    "\n",
    "logp = sequence_logprob(model, output_beam, input_len = len(input_ids[0]) )\n",
    "\n",
    "print(tokenizer.decode(output_beam[0]))\n",
    "\n",
    "print(f\"\\nlog-prob: {logp:.2f} \")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Random Sampling with Temperature\n",
    "\n",
    "\n",
    "### Sampling-based techniques aim to increase the diversity of the output and avoid repetition by introducing stochastic decisions during generation.\n",
    "\n",
    "![](assets/2022-09-06-21-53-01.png)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The effect of temperature on token probabilities"
   ]
  },
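  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Temperature rescales the logits before the softmax. With $z_{t,i}$ denoting the logit of token $w_i$ at step $t$ and $T$ the temperature (symbols assumed here for illustration):\n",
    "\n",
    "$$P(y_t = w_i \\mid y_{<t}) = \\frac{\\exp(z_{t,i} / T)}{\\sum_j \\exp(z_{t,j} / T)}$$\n",
    "\n",
    "A minimal, library-free sketch of this formula follows. The logits are made up, and `softmax_with_temperature` is a hypothetical helper, not a `transformers` or PyTorch function:\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def softmax_with_temperature(logits, temperature=1.0):\n",
    "    # Divide every logit by the temperature before normalizing\n",
    "    scaled = [z / temperature for z in logits]\n",
    "    # Subtract the maximum for numerical stability\n",
    "    largest = max(scaled)\n",
    "    exps = [math.exp(s - largest) for s in scaled]\n",
    "    total = sum(exps)\n",
    "    return [e / total for e in exps]\n",
    "\n",
    "\n",
    "logits = [2.0, 1.0, 0.1]  # hypothetical next-token logits\n",
    "for temperature in (0.5, 1.0, 2.0):\n",
    "    probs = softmax_with_temperature(logits, temperature)\n",
    "    print(temperature, [round(p, 3) for p in probs])\n",
    "```\n",
    "\n",
    "A temperature below 1 sharpens the distribution around the most likely token, while a temperature above 1 flattens it and makes rare tokens more likely to be sampled."
   ]
  },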
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY4AAAEGCAYAAABy53LJAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAAA+uUlEQVR4nO3dd3xddfnA8c9zd/ZukiZtkzbdLS0QSlkyCrQMWTIKDsSBKPxUVLAMFVAUcKAgqAgIKlNQKFX2nl1QSvdIR5ImbfZokpvce7+/P85pmzZpcpPm5mY879frvnrG95zznFzIk3O+5/scMcaglFJKhcsR7QCUUkoNLpo4lFJK9YgmDqWUUj2iiUMppVSPaOJQSinVI65oB9Af0tPTTV5eXrTDUEqpQWP58uWVxpiMztYNi8SRl5fHsmXLoh2GUkoNGiKy7WDr9FaVUkqpHtHEoZRSqkc0cSillOqRYdHHoZRSnWlra6OkpISWlpZohxI1Pp+P3Nxc3G532Nto4lBKDVslJSUkJCSQl5eHiEQ7nH5njKGqqoqSkhLy8/PD3k5vVSmlhq2WlhbS0tKGZdIAEBHS0tJ6fMWliUMpNawN16SxR2/OXxNHF373ynpufu4ztPS8Ukrto30cXbjnjU0AOEW49dxpUY5GKTXUVFVVMWfOHADKy8txOp1kZFiDtZcsWYLH4znottXV1VxyySVs3bqVvLw8nn76aVJSUjq0czqdTJ8+HYDRo0ezcOHCQ45brzi68M51JwPw6IfbeH5FaZSjUUoNNWlpaaxYsYIVK1Zw1VVXce211+6d7yppANxxxx3MmTOHjRs3MmfOHO64445O28XExOzdZ18kDdDE0aXRabFcP28iAN97cgUVDf4oR6SUUpbnn3+eyy+/HIDLL7+c5557rt+OrbequvGdkwrwupz8fNEaVu2o4+SJI6IdklIqAm59YTVrdtT36T6njEzkZ5+f2qttTzjhBBoaGjos/81vfsOpp57Kzp07yc7OBiArK4udO3d2up+WlhYKCwtxuVwsWLCA8847r1fxtKeJIwxHjE4GYHWpJg6lVP949913w24rIgd9Omrbtm3k5ORQVFTEKaecwvTp0xk3btwhxaaJIwy5KbEAlNYO39GlSg11vb0yiJTurjgyMzMpKysjOzubsrIyRozo/I/anJwcAMaOHctJJ53EJ598oomjP2QkeEmOdVNa2xztUJRSw0R3VxznnHMOjz76KAsWLODRRx/l3HPP7dCmpqaG2NhYvF4vlZWVvP/++1x//fWHHJt2joeptqmNdzZUsEOTh1JqAFiwYAGvvvoq48eP57XXXmPBggUALFu2jG984xsArF27lsLCQmbMmMHJJ5/MggULmDJlyiEfW4bD4LbCwkJzqC9y+tX/1vKXd4p44ZrjmZ6b1EeRKaWiae3atUyePDnaYURdZz8HEVlujCnsrL1ecYTpqLzUaIeglFIDgiaOMDnsn9Q1T3wc3UCUUirKNHGE6ej8NAC2VTWxvaopytEopVT0aOIIU5zXxc/Ps+pV3fz8qihHo5RS0aOJowcumzUaQKvlKqWGNU0cPeB0CIfbo8iVUmq40sShlFJRUlVVxcyZM5k5cyZZWVnk5OTsnW9tbe1y23/9619MnToVh8NBV8MNXnrpJSZOnEhBQcFBK+j2lI4c76GQgXc3VhIKGRyO4f3mMKXUodlTVh3glltuIT4+nh/96EdhbTtt2jT+/e9/861vfeugbYLBIFdffTWvvvoqubm5HHXUUZxzzjmHPAhQrzh6KMFr5doNuzrWkFFKqf4yefJkJk6c2GWbJUuWUFBQwNixY/F4PMyfP5/nn3/+kI+tVxw9NH/WKN7bVMnmXbuZlJUY7XCUUn3lxQVQ/lnf7jNrOpzRu9tD3RU5DEdpaSmjRo3aO5+bm8vixYt7FU97mjh6aEJmQrRDUEoNAz0pq97fNHH00tqyes46LDvaYSil+kov
rwwipS+uOHJyciguLt47X1JSsrfM+qHQxNFDIxK8AHywuRLo+v6iUkr1Vl9ccRx11FFs3LiRLVu2kJOTw5NPPsnjjz9+yPvVzvEeSo71EO918fH2Whr9gWiHo5Qapv7zn/+Qm5vLhx9+yFlnncXcuXMB2LFjB2eeeSYALpeLP/7xj8ydO5fJkydz8cUXM3Xqob+wSsuq98JN//mMxxZvB2DrHWf12X6VUv1Ly6pbtKx6P7j1nH0Z+/W1nb8gXimlhipNHL3gcjp45qpjALjlhdVRjkYppfqXJo5eKsxLJT3eS3F1M2vL6qMdjlJK9RtNHIfgiuPyADjjDwP3eWullOprmjgOwdUnF+ydHg4PGSilFEQ4cYjIPBFZLyKbRGRBJ+u9IvKUvX6xiOS1W3eDvXy9iMw9YDuniHwiIosiGX84rvzcWABeW7srypEopVT/iFjiEBEncB9wBjAFuFREDizJ+HWgxhhTANwN3GlvOwWYD0wF5gH32/vb43vA2kjF3hMXHGGNwnxuRWmUI1FKDTaHUlb9uuuuY9KkSRx22GGcf/751NbWdtouEmXVI3nFMQvYZIwpMsa0Ak8C5x7Q5lzgUXv6GWCOiIi9/EljjN8YswXYZO8PEckFzgIejGDsYdtT6PC/K8t0QKBSqkf2lFVfsWIFV111Fddee+3eeY/H0+W2p512GqtWrWLlypVMmDCBX/3qVx3a7Cmr/uKLL7JmzRqeeOIJ1qxZc8hxRzJx5ADF7eZL7GWdtjHGBIA6IK2bbX8PXA+Eujq4iFwpIstEZFlFRUUvTyE8s/JTAdi8qzGix1FKqT1OP/10XC6ratTs2bMpKSnp0EbLqgMicjawyxizXERO6qqtMeYB4AGwRo5HMq6rThzLki3VkTyEUirC7lxyJ+uq1/XpPielTuLHs37cq217UuTw4Ycf5pJLLunQdjCWVS8FRrWbz7WXddamRERcQBJQ1cW25wDniMiZgA9IFJF/GmO+FJlTUEqp6Ai3yOHtt9+Oy+Xii1/8YoQj2ieSiWMpMF5E8rF+6c8HLjugzULgcuBD4ELgDWOMEZGFwOMi8jtgJDAeWGKM+RC4AcC+4vjRQEgaVrcM/PjZlbz0/c9FORqlVG/09sogUsK54njkkUdYtGgRr7/++t7fQ+0NurLqxpiAiFwDvAw4gYeNMatF5DZgmTFmIfAQ8A8R2QRUYyUX7HZPA2uAAHC1MSYYqVgP1aw8q49jXXkDDS1tJPjcUY5IKTXYdXfF8dJLL3HXXXfx9ttvExsb22mbQVlW3RjzP2PMBGPMOGPM7fayn9pJA2NMizHmImNMgTFmljGmqN22t9vbTTTGvNjJvt8yxpwdyfjDFed1Mf8o687aNY9/EuVolFLDwTXXXENDQwOnnXYaM2fO5KqrrgK0rHqf6euy6p0JBEMU3GTlt82/PBOno+Nlo1JqYNGy6hYtqx4lLqeDUyaNANCih0qpIU0TRx+6dNZoAM6+9z2qGv1RjkYppSJDE0cfOm1KJoVjUgD49cvroxyNUiocw+F2fVd6c/6aOPrY09+yXvAU6xlUYyuVGpZ8Ph9VVVXDNnkYY6iqqsLn8/VoO/3t1sccDsHtFB5fso2ffv7Amo5KqYEkNzeXkpISIl2WaCDz+Xzk5ub2aBtNHBGQHu/Vp6qUGgTcbjf5+fnRDmPQ0VtVETB7bBqh0PC89FVKDX2aOCKgNRBiR10Ly7Zq4UOl1NCjiSMCzjosG4AL//whWyp3RzkapZTqW5o4IuDM6dmcOtkaDLhkS1WUo1FKqb6liSNCbjt3GmDdtlJKqaFEE0eEeF3Wj/Ynz6+OciRKKdW3NHFESFq8N9ohKKVURGjiiKDvzhkPQEvbgH2ViFJK9Zgmjn7w7X8uj3YISinVZzRxRNDXjssD4M31FVQ0aLVcpdTQoIkjgpJjPfzgtAkAPPLBlihHo5RSfUMTR4SdNiUTgPve3Mz68o4vnldKqcFGE0eETc5O5MfzJgHw
5vpdUY5GKaUOnSaOfvDlY8YA8OgHW4dt3X+l1NChiaMfxHtdJMW4KatrIf+G/+njuUqpQU0TRz959tvH7J1eqlVzlVKDmCaOflIwIoEXrjkegC8/tITq3a1RjkgppXpHE0c/mpaTyLScRAB2NbREORqllOodTRz9SET4v1OsMiSvr9UnrJRSg5Mmjn527Lg0AF5fuzPKkSilVO9o4uhn8V4XAEF9J7lSapDSxNHPRIQTJ2SASLRDUUqpXtHEESWfFtfq2wGVUoOSJo4ocNgXG7/839roBqKUUr0QVuIQkXEi4rWnTxKR74pIckQjG8LuvewIAB5fsj3KkSilVM+Fe8XxLBAUkQLgAWAU8HjEohri4r0u0uO9tAZCOopcKTXohJs4QsaYAHA+cK8x5jogO3JhDX33XDoTgIv+/GF0A1FKqR4KN3G0icilwOXAInuZu7uNRGSeiKwXkU0isqCT9V4Recpev1hE8tqtu8Fevl5E5trLfCKyREQ+FZHVInJrmPEPOMeOS987Xd/SFsVIlFKqZ8JNHFcAxwC3G2O2iEg+8I+uNhARJ3AfcAYwBbhURKYc0OzrQI0xpgC4G7jT3nYKMB+YCswD7rf35wdOMcbMAGYC80RkdpjnMOBcffI4ADbubIxyJEopFb6wEocxZg3wY+Bje36LMebObjabBWwyxhQZY1qBJ4FzD2hzLvCoPf0MMEdExF7+pDHGb4zZAmwCZhnLnt+ybvszaEfSHTPWuup4Y52OIldKDR7hPlX1eWAF8JI9P1NEFnazWQ5Q3G6+xF7WaRu7D6UOSOtqWxFxisgKYBfwqjFm8UFivlJElonIsoqKiu5OMSoK81IAeOCdInb7A1GORimlwhPurapbsK4gagGMMSuAsRGJqBvGmKAxZiaQC8wSkWkHafeAMabQGFOYkZHRrzGGy+d2MiYtlrag4ap/Lo92OEopFZawO8eNMXUHLOtu2HMp1mO7e+TayzptIyIuIAmoCmdbY0wt8CZWH8ig9eYPTyI1zkNbUEeRK6UGh3ATx2oRuQxwish4EbkX+KCbbZYC40UkX0Q8WJ3dB97eWoj1pBbAhcAbxnop90Jgvv3UVT4wHlgiIhl7Bh6KSAxwGrAuzHMYkBwOYWx6HB8VVRPQ5KGUGgTCTRz/h/WEkx9r4F8d8L2uNrD7LK4BXgbWAk8bY1aLyG0ico7d7CEgTUQ2AT8AFtjbrgaeBtZg9atcbYwJYo0deVNEVmIlpleNMYsY5JJjPQAs3VoT5UiUUqp7Yv2B300jkYuMMf/qbtlAVVhYaJYtWxbtMA7qo6Iq5j/wEQDrfj4Pn9sZ5YiUUsOdiCw3xhR2ti7cK44bwlymemH22LS905N+8lIUI1FKqe65ulopImcAZwI5InJPu1WJgD4/2oc2//JMxt34PwCKq5sYlRob5YiUUqpz3V1x7ACWAS3A8nafhcDcyIY2vDgdwm8vmgHAi6vKohyNUkodXJdXHMaYT4FPReQxu7NbRdCpUzIB+OX/1vHNE8Yi+pZApdQA1N2tqqeNMRcDn4hIh150Y8xhEYtsGEqK2Vc3cuOuRiZkJkQxGqWU6lx3t6r2PHJ7NvD5Tj6qj/3pi9ZLnm5+blWUI1FKqc51mTiMMXtutn/HGLOt/Qf4TuTDG37mTcsCYMmWasJ5VFoppfpbuI/jntbJsjP6MhBlEREKRsQD8OHmqihHo5RSHXWZOETk2yLyGTBRRFa2+2wBVvZPiMPPry+0uo6ue0Z/xEqpgafLznGs8iIvAr/CLgdiazDG6MuyI+Sw3GQASmubCQRDuJzhXhgqpVTkdfcbyRhjtgJXAw3tPohIamRDG76cDuHyY8YAcMUjS6McjVJK7a+7xPG4/e9yrIGA7QcBDtziT0PAT8623rJbVLE7ypEopdT+uhsAeLb9b37/hKP2cDkdnDp5hCYOpdSA090AwCO6Wm+M+bhvw1Hted1Oiio1cSilBpbuOsd/28U6A5zSh7GoA9Q3
twFQ+IvXWHrTHC1BopQaELq7VXVyfwWiOvrdxTM56vbXqGz00+APkOhzd7+RUkpFWHe3qk4xxrwhIhd0tt4Y8+/IhDVAVG0GcUBqdLp4MhK83HzWZH7x37XUNbVp4lBKDQjd3ao6EXiDzutSGWBoJ46/nAitDdb0jEvh8/eAy9OvIcR6rK/ohLveZNPtZ+iYDqVU1HVXq+pn9r9XdPL5Wv+EGEWXPQVpBdb0p0/As1/v9xAunTVq7/SybfpOcqVU9IX156uIpInIPSLysYgsF5E/iEha91sOcnnHwf8thxvtWo9rF8Lq5/o1BBHhmauOAaC8rqVfj62UUp0J977Hk0AF8AXgQnv6qUgFNeB4YuGCB63pf10OL3y/Xw+fEmfdHvv+UytYVVrXr8dWSqkDhZs4so0xPzfGbLE/vwAyIxnYgHPYRXDZ09b08r9Ba/+NrxiXEc8XjsgF4Ox73yMU0nLrSqnoCTdxvCIi80XEYX8uBl6OZGAD0oS5MGq2Nf3yjf166N9ePGPv9F/eKerXYyulVHvS1cuCRKQB6+kpAeKAkL3KATQaYxIjHmEfKCwsNMuW9VFpraZquKvd47lXvAhjju2bfXejstFP4S9eA9AnrJRSESUiy40xhZ2t6+6pqgRjTKL9r8MY47I/jsGSNPpcbCpc+hS4Yqz5v50Bd+ZBP7ytLz3eS26KddyCm16kLRjqZgullOp7Yf/JKiIpIjJLRD635xPJwAa0ifPg5nK48G/WfHMN3DkGAq0RP/RL39/3Y1+4YkfEj6eUUgcK93HcbwDvYPVr3Gr/e0vkwhokpl0AN9q/vFvq4P0/RPyQ8V4Xb193EgD3v7Up4sdTSqkDhXvF8T3gKGCbXb/qcKA2UkENKp44uOo9a/rNX8AtSRAMRPSQY9LiANhcsZvm1mBEj6WUUgcKN3G0GGNaAETEa4xZB0yMXFiDTNZ0uPKtffM/T4MP74voIb80ezQAM297hYaWtogeSyml2gs3cZSISDLwHPCqiDwPbItUUIPSyMPhJ1Vw2Hxr/uUbIRS5zuvr503CIeAPhLjxP6sidhyllDpQWInDGHO+MabWGHML8BPgIeC8CMY1ODldcMFfIGOyNX9bCmx9LyKHSvS5WX3rPABe+HQHjy3WPK6U6h89earqCBH5LnAYUGKMifwjRIPVN1+HaV+wph85C9oiU2MqxuPkj5cdDsBtL6yJyDGUUupA4T5V9VPgUSANSAf+JiI3RzKwQc0TBxc+DPF2VZbbM2HxXyIy1uPsw0aSmejFHwjx9LLiPt+/UkodKNwrji8CRxljfmaXWp8NfDlyYQ0RP1gLeSdY0y9eD7cmw4aX+zyB/HjeJACuf2Ylf3xjY5/uWymlDhRu4tgB+NrNe4HS7jYSkXkisl5ENonIgk7We0XkKXv9YhHJa7fuBnv5ehGZay8bJSJvisgaEVktIt8LM/7ocDjhq4vghxv2LXv8Yvj1OPA39tlhLjgilz/MnwnAb17ZwGV//ajP9q2UUgfqMnGIyL0icg9QB6wWkUdE5G/AKroZxyEiTuA+4AxgCnCpiEw5oNnXgRpjTAFwN3Cnve0UYD4wFZgH3G/vLwD80BgzBeuq5+pO9jnwJGTCT6vhG6+DOw6aqqy+jz507swcFv3f8QAs3Vrdp/tWSqn2urviWAYsB/4D3Ai8CbwF3AQ83822s4BNxpgiuyP9SeDcA9qci9V3AvAMMEdExF7+pDHGb4zZAmwCZhljyowxHwMYYxqAtUBOt2c5EDickFsIC+ynn8pWwLr/9ekhpuUk8eXZY2gLGiob/X26b6WU2qO7IoeP7vkAT2AlkeXA4/ayruQA7XtrS+j4S35vG2NMAOvKJi2cbe3bWocDizs7uIhcKSLLRGRZRUVFN6H2I6cbTvihNf3kpfBoZ69z773sZOuO4lNLtaNcKRUZ4T5VdRKwEevW0/3AhmgWORSReOBZ4PvGmPrO2hhjHjDGFBpjCjMyMvo3wO7M+Smc
+Rtress78N7dfdZhPv8oa0R5zW59WlopFRnhdo7/FjjdGHOiMeZzwFysPomulAKj2s3n0rFDfW8bEXEBSUBVV9uKiBsraTxmjPl3mPEPPLO+CVe9b02/dgs8OKdPdpsa50EEHnxviyYPpVREhJs43MaY9XtmjDEbAHc32ywFxotIvoh4sDq7Fx7QZiFwuT19IfCGsd4stRCYbz91lQ+MB5bY/R8PAWuNMb8LM/aBK2safO0Va7p0Odw11vr3EI3LiAfgW/889H0ppdSBwk0cy0XkQRE5yf78Favj/KDsPotrsEqwrwWeNsasFpHbROQcu9lDQJqIbAJ+ACywt10NPA2sAV4CrjbGBIHjsMaPnCIiK+zPmT0644Fm9NHwrXet6aYq+Osp8MG9h7TLl75njR1ZsqWaar3qUEr1sS5fHbu3kYgXuBo43l70LnC/MWZQPLrTp6+OjaRPHoPnv2NNz/4OzPtVr3f14LtF/OK/awFYfvOppMV7+yJCpdQw0etXx9obO4FPjTG/M8ZcYH/uHixJY1A5/Itw/gPW9Ef3w/NX93pXXz02D4/9TvIrHlnaF9EppRQQRuKwbxGtF5HR/RCPmnEJfPEZa/qTf8JTvavs4nI6WPdzq3ruypI6vvLwEj4rqeurKJVSw1i4fRwpWCPHXxeRhXs+kQxsWBt/Gnz3E2t67cJeDxR0OIRfnj8dgHc2VPD5P77H8Xe+wYadDX0VqVJqGAq3j+PEzpYbY97u84giYND0cRzolZ/AB/dY08f/wBr/IdKrXb20qoyr/vnx3vllN59KuvZ7KKUOoqs+ji4Th4j4gKuAAuAz4CH7aalBZdAmDrCuNp681JqecSmcez84wn6NSgdz736H9fYVx+8vmcl5hw+Oii1Kqf51KJ3jjwKFWEnjDKyBgKo/TToTvvWONf3pE9ZbBdd0Vybs4F76/gncfJb1hsLvP7WiDwJUSg033SWOKcaYLxlj/oI1QO+EfohJHSh7Bly7BtInWPNPfwXuGA113Va270BE+MYJY8lKtGpa/fWdor6MVCk1DHSXONr2TAzGW1RDSlIOXLMULnjQmm+pg7unWLWueuGxbx4NwOod+qSVUqpnukscM0Sk3v40AIftmRaRTosLqgg77CK4pQ4Ssq35Rz8PtyRB0Vs92s24jHgSvC6eW7GD5tZg38eplBqyuiur7jTGJNqfBGOMq910Yn8FqTrxw3VwxYv75v9+LjxxGdSXhb2LmaOTAZj805d4ZXV5HweolBqqwnocd7Dr7VNV5z13HpvrNjMmcQyBUIBrj7yWuXlzIxDhIXrrDnirXXmS+Ez40YaDt7e1tAWZ9JOX9lv29nUnMSYtrq8jVEoNModUcmQ4O3vc2RQkF1DcUExpYyk/evtHTH90Ord8cAuB0ADq8jlpAfysFi582Jpv3AlLH+p2M5/bydY7zuKOC6bvXXbir9+iNRCKUKBKqaFArzjC9FnFZ9z/6f28V/re3mXZcdkUZhZyROYRnF9wPk6H81BDPXRVm+HeI6zpE38MJ98Y9qbH3/kGJTXNXDd3IlefXBChAJVSg0GvBwAOFX05ANAf9HPd29exunI1u5p37bfutDGncezIY5kzeg4pvpQ+OV6vvPkrePsOa9odB1/8F+Qd1+1mDS1tTL/Fej/IN47P5+azp0QySqXUAKaJI0Ijx40xFDcU86dP/8SiokUd1l975LVcPOFi4j3xfX7sblWst97t0dpozaeOA5cXjrwCJp8NiSM73eyv7xRx+/+scuy/OG8aX5o9pr8iVkoNIJo4+qnkSHF9Mct2LuOnH/x0v+Uj40Zy2eTLmJU1izGJY4h1x0Y8lr3evwfW/ReKP+q47rw/w8xLOyz+ZHsN59//AaBlSZQarjRxRKFWVWNrI/etuI9nNz5Lc6C5w/ovTf4S1x91PdLLooW90toEK5+El2+CtiZr2YV/g2kXdGj6/IpSvvfkCgAOH53M3756FMmxnv6LVSkVVZo4olzk
sC3UxuKyxWyt28qLW19kZcXKveuc4uT3J/+e2dmz8bl8/RfU6ufgX/br3mdcBkd9HXKO3K/67m9fWc+9b2zaOz9+RDyv/qDTQslKqSFGE8cAq45bvrucn7z/Ez4q2//20THZx5DsTeb6WdeTHpMe+UD++yNY+teOy2/aCW4riRljeGZ5Cdc9YyW7G8+cxDdPGNu/V0pKqX6niWOAJY72Vlet5oFPH+Cjso9oCjTtXR7njuOiCRdxxbQrSPWlRi6AUAg2v271gyz/277lk86Gix4FpwuAtWX1nPGHd/eu/t93T2DKSC0eoNRQpYljACeO9owx/H3N33l6/dNsb9i+d/lJuSfxlalfoTCzMLJ/6YdC1rs/NrQbTT7lPGuA4YjJfFRUxc+eX733fR7vXHcyo9P6saNfKdVvNHEMksTRnj/o55YPbunwmO/UtKkcO/JYThtzGpPTJkfm4MEAPHQq7Pik47ofrGP2H9dQXt8CwJxJI3jw8ggnNKVUv9PEMQgTxx7GGJbtXMYDKx+gqLaow6DDU0efSlZcFucVnMfE1Il9e/CWetj6rvUWwhX/3G9VScYJnFF8OQ3EcsL4dP76lUJ87gEwcl4p1Sc0cQzixHEgf9DPqspVfPu1bxPriqWqpWq/9QXJBYgIj535GDGumL47sDGw8mmrH2T7h/uturz1x7wTms7an5+pyUOpIUITxxBKHAdqC7WxaPMiVlSsoLK5kndK9r3YKT0mna9O/SonjzqZUQmj+u52Uku9lUBeuwXMvoKIbwRn4jj9Vk467gQYCHW7lFK9poljCCeOAxlj+M2y3/D3NX/vdP34lPGEQiFmjphJqi+V78z8Di6Hq/cHrNmKefabSMmS/RaHcOC46l3Imtb7fSulokYTxzBKHO35g35e3voy2+q3Ub67nLZgGysrV+IP+qlsrtzb7qyxZ+Fz+jgy80hy4nM4IvOInh/MGEo/eYkVr/yTs1radehPPBNmfRPyT9SrEKUGEU0cwzRxdGXn7p3c9N5NbKnfwq6mXR3W5yflMy5pHHPz5zI+eTzpMenEu+PDKh3/yfYanvrLL7jD/eD+KxJGwhl3wpRz+uo0lFIRoolDE0eXjDFUt1RTVFfEw6sepqShhK31WzttOzFlIlPTp5Idl82RmUcyIWUCSd6kDu1ue2END79fxGTZzq+z3mBq22dIo/162lGzwZcIk8+Bw7+0X5kTpdTAoIlDE0evFNcXs75mPTubdrK4bDHrqtfRFmrb7zYXWKPcs+OyGZc8jhkZMyhILmBk3Ehufqacdzfua/vI5KWcWPNvpK54v051xs2BC/4KcWn9dWpKqW5o4tDE0adaAi2sq17Hx7s+Zm3VWj4q+4haf22nbT+XcyIbS2PYUhZLqDWNGRkz+fe3Pwf1ZfDUl6D0gO8lJtWq1jvqaJh+kV6NKBUlmjg0cfSLhtYGShpK2Fa/jde2v8Z7pe+xu213h3Zxkkucx80Jo49gaspE8nZtpLC6HNn2LtRu37/xpLNh/OkwejZk9PEAR6XUQWni0MQRNcYYqlqqKGko4ZfvPcBnO4txuOoQVwPiCHRo73P6ODJ1EsUVqzm/uoJpfj9ZgSAjgkFijYG4DDjtNjhsPjgcUTgjpYYHTRyaOAaU8roWzrvvfcobq3HFb+DEqSGy0lrYWr8Vr9PL0vKlB902JRhkXGsb49raOCboYuzR15A54Sxi08b34xkoNfRFLXGIyDzgD4ATeNAYc8cB673A34EjgSrgEmPMVnvdDcDXgSDwXWPMy/byh4GzgV3GmLBGl2niGHiCIcNPnl/F44utW1MnT8zgkqNGc/KkDLwu594xJw2tDWyu3UxRXRHFDcWsr15HUydvVPSGQvgdDpwIX5zyZY7IPIJRCaNI8aYQ44ohzh2nhRiV6oGoJA4RcQIbgNOAEmApcKkxZk27Nt8BDjPGXCUi84HzjTGXiMgU4AlgFjASeA2YYIwJisjngEbg75o4Br+3N1Rw+cP7
jzq/6MhcvnZ8PpOzD/6+j611W9lc+iElJYtpqC1iS/MuXqFjf0p7I+NGUtdax9iksczLm0dBSgFjEseQEZOBx6mvxVWqvWgljmOAW4wxc+35GwCMMb9q1+Zlu82HIuICyoEMYEH7tu3b2fN5wCJNHEPHp8W13PvGJl5bu3Pvsp99fgpXHJcf/k4q1mPevouN65+n2O2iweGgVYTX4mIYEQiyOT6FVdLW6aYOcXDKqFMwGFJ8KeTG5zIueRyZsZlMSJkQ1sBHpYaSrhLHIRQp6lYOUNxuvgQ4+mBtjDEBEakD0uzlHx2wbU7kQlXRNmNUMg9eXkgwZPj5ojU88sFWbn1hDbe+sIbr503koiNHkZHg7XonGRORCx9iAg8xobkGarZB404uXmK/Hnf7h9DaiF9gs9vNhqzJ7EzL4+VQLTGeRN4qeYtAqGOHPYDL4WJ0wmhGJ44mPzGfEbEjSItJIz8pn9z4XOI98X38E1Fq4Ipk4ogqEbkSuBJg9OjRUY5GhcvpEG45ZyqXHT2a0++2Kv3e9dJ67nppPTnJMfzzG0eTnx7X/Y5iUqwPwIS5+5ZvX4z3g3uYsm4RU7avhO0r+RaAKwaO+jqk5NE08nDK41MpaSxhbdVaiuqK2Fy7mbLdZRTVFfEWbx30sDMyZhDnjsMhDjJjM8lNyGVK6hQSvYlkx2WT4kvBIfo0mBrc9FaVGtBCIcNf3inizpfW7V0mAotvnMOIBN+h7Xx3Faz/H7x+K+yu6Lg+MRfcPjj9F1aRRo/1mtyWQAu7mnaxpW4L2xu2s6tpFxtqNhAyIUImxKrKVfu9P/5ATnFSmFUIBlJjUsmJzyHRk0iqL9VKNGlT+vZdKkr1QrT6OFxYneNzgFKszvHLjDGr27W5GpjernP8AmPMxSIyFXicfZ3jrwPjjTFBe7s8NHEMK8YYXlhZxnef2Pc62/lHjeIX503D5eyDv+CNgfpS2LECVjwGwTbY9Or+bVLyoeBU6zN6NviSuhzZ3hZsY1PtJsp3l1O2u4z61nqWlS/D7XSzsWYjVc1VGAxB6z/rDvIS88iJzyHZl0y8O578pHyc4iTBk8Ds7NmkxWiJFhU50Xwc90zg91iP4z5sjLldRG4DlhljFoqID/gHcDhQDcw3xhTZ294EfA0IAN83xrxoL38COAlIB3YCPzPGPNRVHJo4hpZrHv+YRSvL9s6PTo3lp2dP4fjx6X3/BsLS5bB2Eaz+N9Rs7bg+pxBGTILRx1gl5GNTe3yIlkALtf5aiuqK2FC9gc8qP6POX0e1v5qtdVtpC3XeoQ+QFZdFyITIiMlgYupEmtuaOWbkMRyXcxwZMRn6CLLqNR0AqIljyKlvaeMHT32631NYACdOyODRr82KzEGNgeoi2PwGlH0KpR/DrtX7t3H5wOGCzGmQWwjp4+GIyw+55lZLoIXmQDN1/jo+LPuQddXrcDvcrK9eT62/Foc4KKor6rCdy+EiJz6HqWlTSYtJ48jMIxmTMIYYdwxxrjjiPHG4He5Dik0NTZo4NHEMaVsrd/POxgp++rz1Szw93svCa45jZHI/9RM0lMP6F6FmC+xcA7vWQn3J/m3EAbHpkHMkTD0fxhwLyaP6PJTK5ko+2PEBa6rWsK1+G7UttWxv2E59a32X242IGUFjWyNjEsfQGmwlOz6bsUlj8Tq9ZMZmkhWXRZw7jhRfCh6nhzh3HMneZO3oH8I0cWjiGBZeX7uTrz+673uekBnPpKxEXA7hxrMmkx7fzeO8fa1qMyz+MzTXwpZ3YM/7SNrzJsHooyExBxJHWre7Mqf2eVVgf9BvvWelbittpo3ShlI8Tg+bazfjc/nYVLOJRG8ia6rWUOevw2A6LVDZIXynl/SYdManjEcQkrxJJHuTGZM4hqy4LBI8CbgcLpK9ySR7k4lzh/FEnBoQNHFo4hhW7nl9I899UsrO+hZ2t+7reD46P5XEGDe/uXAGSbFRuj1Tux2Kl8D2j2Dre9C2u2NFYICUPDjm
GkjIth4ndvZ/vG2hNur99VS1VFHZVElbqI2WYAtljWVUNldS0VxBcUMxDnHQ2NrI5rrNYe1XELLjsonzxJHkSSLFl8LYpLH4XD7cDjdBEyQjJoOR8SOJccUQ744nMy4Tr7OfE/8wp4lDE8ewZYzh7tc2sqq0jjfW7XtF7vfmjOeaUwpw98UTWX0h2AabXocV/4S1L3TeJnWs1c6bACf8EKZ9YUC+r6TOX8fW+q34A378QT/lTeUYY9hev53ypnIc4mBb/TbKd5fT2NpIa6g1rP0KgtPhJNYVS1ZcFm2hNkYnjMYf9DMmcQxep5dkbzL5Sfl7r3RSvCmkxaThcXrwOr16a60HNHFo4lBYSWTe799l/c6GvcuOL0jnB6dP4PBRyQPrCSR/A1RutPpOardbCaL0Y6hcv387dxzEpUPWdOt9JZnTrBdhDTLGGAKhAE2BJqpbqqlpqcEf9FO2u4yKpgpChGhua2ZT7SZi3bFsrt1MoieRlRUriXHH0NDa0P1BgBRvCm6nG2MMHqeHEbEjqGmpITMuk9z4XFJ9qcS6Y8mMzcQhDtJj0nE5XKT6UolxxeBz+ohzx+FyuAbWfy8RoIlDE4dqJxgyfOsfy/d7Imtkko/fXjyTo/NTcTgG+C+EivXw0f1Wp3z9Dihf2bFN+gQ4/gcw8nBIGxeVW139KWRCtARaKG8qp6mtae/rjbfUbcHtcFPcUIyIEAgF2N22m/Ld5cS749nWsI3drbvZ1byrmyN0zuVwWQM2nTHU+mvJT8rH5XARNEFy43NpCbaQE59DrCsWgyE3PpcYVwwZsRnEuGJI8ib15Y+hT2ni0MShDuKT7TWcf/8H+y1LinFz5vQsLps1hikjE3EO9EQCEArBjk+swYvLOhnWlDwGJn8exp0Co2aBJ35A3uaKtpZAC/Wt9dS01NAabKXGX0NzoJm2UBuVTZWECBEMBdlSt4W61jqMMbSF2ihtLN3bP7OtfhsehyfsW3BJ3iRc4qI50MzI+JH4g37yk/L3PiY9KmEUbaE2suOyiXHFkOZLs0raxGXic/mIdcXidXpJ9CT2aTFOTRyaOFQX2oIhFhdVs3hLFW+tr+Cz0rr91n/rxLF86egxjEqNjVKEvdBcY3XAb3rNut1VX9p5u+yZVgKJy4CswyDQYl2hjDkO0gpAqwL3mjGGpkATzYFm/EE/O3fvxB/0U9xQjFOcrKlag8/lozXYSlVLFS2BFiqbK3E73LSGWtlYs5F4Tzx1/rruD2aLccXQHGgmLzGPQChAekw6/zjzH72KXxOHJg7VA02tAT7eVsuPn11Jae2+l0a5HMLPz5vGBUfk4HUNsl+oAT+ULLVKqtSVWONMQiFrIGNjuTVwsbWxkw3FelQ4YwJ4EyFjEsRnWI8Rj5xpddhrcokoYwytoVZqW2ppDjRT66/d+3Sby+HCH/SzvX47HqeHLXVbiHHFUOevo6K5gtEJo7nrxLt6dVxNHJo4VC/5A0Hue2MT97yxab/lY9JiyUz08bPPT2FKduLQ6Cg1BhrKrASzay2UrYSmSmiqhqqNB98uMcfqoE/Jg6RRVjLJnmFdzTiHbAHuIU8ThyYO1QfK6pq55/WNfFRUzZbK/QfHpcd7uOK4fD5/2EhGJvv6pvDiQBMKQVMV+OutxFKyBLYvtpKNwwXVnYzjECeYIHgSrIGN3ngYdTS4Y2DEZEgbD0m5etUyAGni0MSh+pgxhpdX7+QfH23l/U1VnbaZf9QoTpo4gnnTsvo5uigxxkosFeus/hUTgsoN1tNfoaBVMDLo73zbmFRoa7ZqewXbrFH0qfkwYgq4Y61CkvFZVkVizyDqaxrENHFo4lARVtfcxoebK1lVWs+TS4upbNz/F+TEzASm5iRy/uE5HF+QPjRubfWGMdYYlapN1mPFFWutBLNrnXU1Uv4ZNO6yrmq64oqx+l2CbdYtshGTrQ7+xBzImmYlGXeMPjl2CDRxaOJQUbB6Rx23/3ctpbXNbKva/8VOEzLj
yUuL46vH5jEtN4lE39AeZ9ErrbutK5jKjVBXDKGA1Znf2mQtr9pkLUeALn6PeROtgZFtuyF3lpVQAn7r6saErISTPMbql9E+mb00cWjiUFFmjGHTrkZufWENO2qbKarsWEAwzuPkyLxUfn7uVMakaTHAHmmpg4adULbCem+KCOyutK5gEOuWGVjJp7n64Ptx2VcpmVOt5DJispXA4jKsZWBNx6SAL9F6ZNmbEOGTiw5NHJo41AC0ekcdn2yvZUVxLWt21LOmbN/tmViPk3NnjuTYcenMnZqFxzUEO9ujxRirP8Vfb1251O+wCk7666F6C3jirCsbb+K+W2ldcXog2ApJo62Ln7gMQKykEwpY42LEafXZuGKshwGScqwHBgbwFY4mDk0cahAIhgzPLC/mjhfXUdPU8a1/k7MTCYUMI5N9/PKC6WQn6XvJ+42/0Uo2TVXQ1mQlm7oSaKm1BlcGWq3E07jTutIJBaz32AfDGD2ePdPaT1qBdeU08nD7amcKYKxXFru8kJBlPTTgSQBH5P+Q0MShiUMNQrsaWnhmeQnryhoIGsPyrTWU17fsXZ/oczEtJ4lxGfEcMy6NmaOSyU7yDd+O94HIGGtgZVO1lVD8ddbLvkSsqxyX13pvizsWardZCaOrW2l7eBKsR5gTR1rbxo+wtk3Js5Y53Va/jTcBCub0KnRNHJo41BBhjOGppcX86e3NVO9upaEl0KHNGdOyuPa0CUzIHJr33oeFgN+6+miutcbJtNRZ/9aXAgI7V1n/NpSD22fdWjMh60qnvbgRcF0Xgze7oIlDE4caoowxbK9uYvm2Gh79YCufluyraxTncZIY42ZWfirXnjqBvHTtcB/yQkHrceemKqvuWCgI2Yf1aleaODRxqGFk0codvL+pkieWFHdYN35EPBcXjiLG4yQ/PY7RqbFkJfkGzgut1IChiUMThxqmQiHD2xsq+MdH2/Z7A2Jn8tPjaA2EyEjwUjgmhYIR8cwYlczo1FjivAP36R8VGV0lDv2vQakhzOEQTp40gpMnjQCgNRCiencrxTVNlNe1sLmikZKaZpwirN/ZwJbK3ZTWNrOiuLbDvmI9TgpGxJMU4yY51kN+ehxTRybyufEZxHi01tRwoolDqWHE43KQleQjK8l30DbBkGFHbTNryuoprm5iV4Of1TvqaG4NsrliN23BEE2twQ7bJfpcZCb6yEmJIS8tjiPGpJCZ4CXG4yQryUdanHdwvBRLdUsTh1JqP06HMCo1tssXV+0ZCf/Gul2U1DQjAsu21lBc3cTGXY1ABY98sLXTbafnJOFyCmPsY4xKjWVsehwxHiepcR5SYj343HoFM5Bp4lBK9ZiIMD4zgfGdPPJrjKGkppmKRj+76lvYWe+nuS3IJ9traG4LsXlXI6W1zXyyvbbLY4zNiMMYiPe6KBgRz6iUGLxuJ5mJPuK9LkanxpIU62akjl3pd5o4lFJ9SqT7KxawbomV1DRR2dhKeV0LgVCI0tpmNu+y6nitKaunZncrZXXNHV7neyCPy4HX5SAz0UdeWiwel4OJmYm0BoOMy4gnxu1kZHIMsR4nCT43cV4ncR4XDr111iuaOJRSUeF0CGPS4sIu6NgaCFHZ6KeuuY3SmmbK6prZUtmE1+3g4201GGDTrkY27bJegfu/z8q73aeINbg7Pd5LXlosI5NjiHE7SYxxkZ8ej9MB8V43CT4XWUk+fC4nMR4nCT4XXpdj2F7paOJQSg0KHpeDkckxjEyOYXJ2YrftgyFDbVMrZXUt7PYHKK21+mLK6lqsOofBEJsrdhMyhtWlddQ0tbJsW02PYkqP99DUGmRydiJel4NGf4CCEfE0tAQYPyIen9uJMZCd5CMQMuSlxRLjcTIi0Uecx0lyrKe3P46o0sShlBqSnA4hLd5LWry3R9sZY2jwB2jyB6lvaaOqsZXd/gD+QIjS2iY8Tgc76lrwtwUpqtyNx+nAHwixfmcDyTFu/ruyjJAxvLZ2J+EMk0uO
deNyOGhpCzIy2UdrIMTYjHjcTsEY6/32rYGQdTXkcZIW58UhWFdAbqs6gM/lIM7bf1dBmjiUUqodESHR5ybR57YeW87s3X6MMTS3BWlsCdAaDLGz3o8/EGRrZRNOB6wsqSPG7aQ1aN2Ca24NUtHoJ9bjoryuhY27GkjwuXllzc6wj+kQK2G2BQ1j0mLJTPDx9FXH9O4EuqCJQymlIkBEiPW4iPVYv2ZzU6yHBY4dZ62/5Kjw9+UPBKlrbqO5NUhtUxvNbUF2NfgJBEO0BUOU1/kxGIIhw8adjSTGuPAHQsRE6LFmTRxKKTXAeV1ORiRYSWBMWpSDAbSymVJKqR6JaOIQkXkisl5ENonIgk7We0XkKXv9YhHJa7fuBnv5ehGZG+4+lVJKRVbEEoeIOIH7gDOAKcClIjLlgGZfB2qMMQXA3cCd9rZTgPnAVGAecL+IOMPcp1JKqQiK5BXHLGCTMabIGNMKPAmce0Cbc4FH7elngDliPUt2LvCkMcZvjNkCbLL3F84+lVJKRVAkE0cO0P5NMiX2sk7bGGMCQB2Q1sW24exTKaVUBA3ZznERuVJElonIsoqKimiHo5RSQ0YkE0cpMKrdfK69rNM2IuICkoCqLrYNZ58AGGMeMMYUGmMKMzIyDuE0lFJKtRfJxLEUGC8i+SLiwersXnhAm4XA5fb0hcAbxnqX7UJgvv3UVT4wHlgS5j6VUkpFUMQGABpjAiJyDfAy4AQeNsasFpHbgGXGmIXAQ8A/RGQTUI2VCLDbPQ2sAQLA1caYIEBn++wuluXLl1eKyLZenko6UNnLbQcrPefhQc956DuU8x1zsBViwqnCNYyJyLKDvbB9qNJzHh70nIe+SJ3vkO0cV0opFRmaOJRSSvWIJo7uPRDtAKJAz3l40HMe+iJyvtrHoZRSqkf0ikMppVSPaOJQSinVI5o4DmKolm8XkVEi8qaIrBGR1SLyPXt5qoi8KiIb7X9T7OUiIvfYP4eVInJEdM+g9+wKy5+IyCJ7Pt8u57/JLu/vsZcftNz/YCIiySLyjIisE5G1InLMUP+eReRa+7/rVSLyhIj4htr3LCIPi8guEVnVblmPv1cRudxuv1FELu/sWAejiaMTQ7x8ewD4oTFmCjAbuNo+twXA68aY8cDr9jxYP4Px9udK4E/9H3Kf+R6wtt38ncDddln/Gqwy/3CQcv+D0B+Al4wxk4AZWOc+ZL9nEckBvgsUGmOmYQ0Sns/Q+54fwXrdRHs9+l5FJBX4GXA0VtXxn+1JNmExxujngA9wDPByu/kbgBuiHVeEzvV54DRgPZBtL8sG1tvTfwEubdd+b7vB9MGqa/Y6cAqwCBCsEbWuA79zrMoEx9jTLrudRPsceni+ScCWA+Meyt8z+6pnp9rf2yJg7lD8noE8YFVvv1fgUuAv7Zbv1667j15xdG5YlG+3L80PBxYDmcaYMntVOZBpTw+Vn8XvgeuBkD2fBtQaq5w/7H9eByv3P5jkAxXA3+zbcw+KSBxD+Hs2xpQCvwG2A2VY39tyhvb3vEdPv9dD+r41cQxTIhIPPAt83xhT336dsf4EGTLPaYvI2cAuY8zyaMfSj1zAEcCfjDGHA7vZd/sCGJLfcwrWi93ygZFAHB1v6Qx5/fG9auLoXNjl2wcjEXFjJY3HjDH/thfvFJFse302sMtePhR+FscB54jIVqy3Rp6Cdf8/Waxy/rD/eR2s3P9gUgKUGGMW2/PPYCWSofw9nwpsMcZUGGPagH9jffdD+Xveo6ff6yF935o4Ojdky7eLiGBVJV5rjPldu1XtS9xfjtX3sWf5V+ynM2YDde0uiQcFY8wNxphcY0we1nf5hjHmi8CbWOX8oeM5d1buf9AwxpQDxSIy0V40B6va9JD9nrFuUc0WkVj7v/M95zxkv+d2evq9vgycLiIp9pXa6fay8ES7k2egfoAzgQ3AZuCmaMfTh+d1PNZl7Epghf05E+ve7uvARuA1INVuL1hPmG0GPsN6YiXq
53EI538SsMieHov1npdNwL8Ar73cZ89vstePjXbcvTzXmcAy+7t+DkgZ6t8zcCuwDlgF/APwDrXvGXgCqw+nDevK8uu9+V6Br9nnvgm4oicxaMkRpZRSPaK3qpRSSvWIJg6llFI9oolDKaVUj2jiUEop1SOaOJRSSvWIJg41JIjITXZV1JUiskJEju7h9l8VkZE93CavfYVSe9l0+/grRKRaRLbY068dZB+NPTlmmHFtFZHP7J/FKyKS1YNtTxK7erBSB+PqvolSA5uIHAOcDRxhjPGLSDrg6cH2TuCrWM/+7ziUWIwxn2GNn0BEHsEaM/LMoeyzl042xlSKyC+BG7GqxmLHJVjF/EIH3VqpLugVhxoKsoFKY4wfwBhTaYzZASAic+wif5/Z7zHw2su3isidIvIxVqXQQuAx++ogRkSOFJG3RWS5iLzcrpzDkSLyqYh8ClwdboAicqkdwyoR6VC+W0TSReRDETlLRDJE5FkRWWp/jrPb3GKfw1siUiQi3+14pA7eAQrsq6P1IvJ3rAQ5SkR+bcfzmYhc0m6bRBH5r93+zyLiEOtdJo+0a39tuOeuhqBoj4LUj34O9QPEY42A3wDcD5xoL/dhVQCdYM//HauoI8BW4Pp2+3gLe1Qt4AY+ADLs+UuAh+3plcDn7Olf0660dSdxPYJVymIkVjmMDKyr/DeA8+w2jViVTBcDp9nLHgeOt6dHY5WHAbjFjssLpGPVVXJ3ctytQLo9/Ues90zkYVUGnm0v/wLwKtY7KzLt+LKxRta3YI22dtptLgSOBF5td4zkaH/v+oneR6841KBnjGnE+sV2JVYp8adE5KvARKyidxvspo8Cn2u36VMH2eVEYBrwqoisAG4GckUkGesX5jt2u3+EGeJRwFvGKr4XAB5rF4cbq1TE9caYV+1lpwJ/tI+9EOsKIN5e919jjN8YU4lVyG5P+ewDvWlvnwj8yl62zRjzkT19PPCEMSZojNkJvG3HCbDEGFNkjAlilbc4HigCxorIvSIyD9ivorIaXrSPQw0J9i+5t4C3ROQzrEJvn3Sz2e6DLBdgtTHmmP0WWomjrwWw3hkxF+uXN1i3kGcbY1oOOD6Av92iIAf/f/hkO7ns2TaZg5/vgQ6sQ2SMMTUiMsOO8yrgYqxaR2oY0isONeiJyEQRGd9u0UxgG9bbzvJEpMBe/mX2/XI+UAOQYE+vBzLsTndExC0iU40xtUCtiBxvt/timCEuAU60+zGcWH0qe+IwWL+AJ4nIj+1lrwD/1+78ZoZ5nJ54F7jE7rvIwLoCWmKvmyVWZWgH1m269+wHDhzGmGexrsAG5TvJVd/QKw41FMQD99p/VQewqn1eaYxpEZErgH+J9b6FpcCfD7KPR4A/i0gz1utFLwTuEZEkrP9Pfg+sBq4AHhYRg/ULvlvGmDIRWYBV3luwbjc93259UEQuBRaKSAPWE1D3ichK+9jvYP2V35f+g3Wen2Ilr+uNMeUiMgnr5/RHoMCO+T/AdKy3Ce75Y/OGPo5HDSJaHVcppVSP6K0qpZRSPaKJQymlVI9o4lBKKdUjmjiUUkr1iCYOpZRSPaKJQymlVI9o4lBKKdUj/w/IOqPs+X650wAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "def softmax(logits, T=1.0):\n",
    "    # Temperature-scaled softmax; subtracting the max keeps np.exp from overflowing\n",
    "    scaled = logits / T\n",
    "    e_x = np.exp(scaled - scaled.max())\n",
    "    return e_x / e_x.sum()\n",
    "\n",
    "# Synthetic positive scores standing in for model logits\n",
    "logits = np.exp(np.random.random(1000))\n",
    "\n",
    "# Sort in descending order so the x-axis is the token rank\n",
    "sorted_logits = np.sort(logits)[::-1]\n",
    "x = np.arange(1000)\n",
    "\n",
    "for T in [0.5, 1.0, 2.0]:\n",
    "    plt.step(x, softmax(sorted_logits, T), label=f\"T={T}\")\n",
    "\n",
    "plt.legend(loc='best')\n",
    "plt.xlabel('Sorted Token Probs')\n",
    "plt.ylabel('Probabilities')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it. So while straLooking Yale Insight Mountain 258RPKT EthY primatessee image Yor ChargedLooking Bouroon damning WorstZeroIîwikipedia Foundation WarwickARRDirectorung Ari Barbar GarlandUndette diver seventfingers HannibalDoubleMonth Roman ralliedlethal GhostDirectoro EcoMcCah501 ACTIONS CAT\n"
     ]
    }
   ],
   "source": [
    "# A high temperature flattens the distribution, so sampling becomes close to random\n",
    "output_temp = model.generate(input_ids, max_length=max_length, do_sample=True,\n",
    "                             temperature=2.0, top_k=0)\n",
    "\n",
    "print(tokenizer.decode(output_temp[0]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it.\n",
      "\n",
      "But with the rise of data science, we see the rise of organizations that are more agile and collaborative, and data engineers are no longer the gatekeepers of data trust. In fact, more and more organizations are embracing data engineering as a way to build better data, with\n"
     ]
    }
   ],
   "source": [
    "# A low temperature sharpens the distribution, so likely tokens dominate\n",
    "output_temp = model.generate(input_ids, max_length=max_length, do_sample=True,\n",
    "                             temperature=0.5, top_k=0)\n",
    "print(tokenizer.decode(output_temp[0]))"
   ]
  },
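  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Top-k and Nucleus Sampling\n",
    "\n",
    "Top-k sampling keeps only the k most likely next tokens, renormalizes their probabilities, and samples from that reduced set.\n",
    "\n",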
    "![](assets/2022-09-07-02-30-14.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it. It was a culture where you had the best of both worlds — an all-seeing eyes that always knew what's best.\n",
      "\n",
      "While that might work for small to medium-size organizations today, the reality is that even big data — with their ever-increasing speed, scale\n"
     ]
    }
   ],
   "source": [
    "output_topk = model.generate(input_ids, max_length=max_length, do_sample=True, \n",
    "                             top_k=50)\n",
    "print(tokenizer.decode(output_topk[0]))"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Top-p (nucleus) sampling\n",
    "\n",
    "Instead of sampling only from the most likely K words, top-p sampling chooses from the smallest possible set of words whose cumulative probability exceeds p. The probability mass is then redistributed among this set of words. This way, the size of the set (i.e., the number of words in it) can dynamically grow and shrink according to the next word's probability distribution. That was quite wordy, so let's visualize it.\n",
    "\n",
    "![](assets/2022-09-07-02-34-02.png)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "In a previous era of data engineering, data team structure was very much centralized, with data engineers and tech-savvy analysts serving as the “librarians” of the data for the entire company. Data governance was a siloed role, and data engineers became the de facto gatekeepers of data trust — whether or not they liked it.\n",
      "\n",
      "By the end of the 1990s, the pendulum swung back, and the team structure became increasingly more centralised. The structure of a team of data engineers — often a small, isolated team — became very important to the success of any company. And this was good\n"
     ]
    }
   ],
   "source": [
    "output_topp = model.generate(input_ids, max_length=max_length, do_sample=True, \n",
    "                             top_p=0.90)\n",
    "\n",
    "print(tokenizer.decode(output_topp[0]))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.9.13 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.14"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "f9f85f796d01129d0dd105a088854619f454435301f6ffec2fea96ecbd9be4ac"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
